Abstract

Formal Concept Analysis (FCA) begins from a context, given as a binary relation between some objects and some 
attributes, and derives a lattice of concepts, where each concept is given as a set of objects and a set of attributes, 
such that the first set consists of all objects that satisfy all attributes in the second, and vice versa. Many applications, 
though, provide contexts with quantitative information, telling not just whether an object satisfies an attribute, but also 
quantifying this satisfaction. Contexts in this form arise as rating matrices in recommender systems, as occurrence 
matrices in text analysis, as pixel intensity matrices in digital image processing, etc. Such applications have attracted a 
lot of attention, and several numeric extensions of FCA have been proposed. We propose the framework of proximity 
sets (proxets), which subsume partially ordered sets (posets) as well as metric spaces. One feature of this approach 
is that it extracts from quantified contexts quantified concepts, and thus allows full use of the available information. 
Another feature is that the categorical approach allows analyzing any universal properties that the classical FCA and 
the new versions may have, and thus provides structural guidance for aligning and combining the approaches. 

1 Introduction 

Suppose that the users U = {Abby, Dusko, Stef, Temra, Luka} provide the following star ratings for the items J =
{"Nemo", "Crash", "Ikiru", "Bladerunner"}:

          "Nemo"     "Crash"    "Ikiru"    "Bladerunner"
Abby      ★★★★       ★★★★★      ★★         ★★★★
Dusko     ★★         ★★         ★★★★       ★★★★★
Stef      ★★         ★★★★★      ★★★        ★★
Temra     ★          ★★★        ★★★        ★★★★
Luka      ★★★★★      ★          ★          ★★

This matrix $\Phi = (\Phi_{iu})_{J \times U}$ contains some information about the relations between these users' tastes, and about the
relations between the styles of the items (in this case movies) that they rated. The task of data analysis is to extract
that information. In particular, given a context matrix $\Phi : J \times U \to \mathbb{R}$ like in the above table, the task of concept
analysis is to detect, on one hand, the latent concepts of taste, shared by some of the users in U, and on the other hand
the latent concepts of style, shared by some of the items in J. In Formal Concept Analysis (FCA) [35, 15, 9, 14, 33],
the latent concepts are expressed as sets: a taste $\tau$ is a set of users, i.e. a map $U \to \{0,1\}$, whereas a style $\sigma$ is a set of
items, i.e. a map $J \to \{0,1\}$. We explore a slightly refined notion of concept, which tells not just whether two users
(resp. two items) share the same taste (resp. style) or not, but also quantifies the degree of proximity of their tastes
(resp. styles). This is formalized by expressing a taste as a map $\tau : U \to [0,1]$, and a style as a map $\sigma : J \to [0,1]$. The value
$\tau_u$ is thus a number from the interval [0,1], telling how close the taste $\tau$ is to the user u; whereas the value $\sigma_i$ tells
how close the item i is to the style $\sigma$. These concepts are latent, in the sense that they are not given in advance, but
mined from the context matrix, just like in FCA, and similarly to Latent Semantic Analysis (LSA) [10]. Although
the extracted concepts are interpreted differently for the users in U and for the items in J (i.e. as the tastes and the
styles, respectively), it turns out that the two obtained concept structures are isomorphic, just like in FCA and LSA.
However, our approach allows initializing a concept analysis session with some prior concept structures, which allow
building upon the results of previous analyses, from other data sets, or specified by the analyst. This allows introducing
different conceptual backgrounds for the users in U and for the items in J.

Related work and background. The task of capturing quantitative data in FCA was recognized early on. The
simplest approach is to preprocess any given numeric data into relational contexts by introducing thresholds, and
then apply the standard FCA method [13, 15]. This basic approach has been extended in several directions, e.g.
Triadic Concept Analysis [17, 18, 26] and Pattern Structures [12, 19, 20], and refined for many application domains.
A different way to introduce numeric data into FCA is to allow fuzzy contexts, as binary relations evaluated in an
abstract lattice of truth values L. The different ways to lift the FCA constructions along the inclusion $\{0,1\} \hookrightarrow L$
have led to an entire gamut of different versions of fuzzy FCA [3, 4, 7, 8, 23], surveyed in [5]. With one notable
exception, all versions of fuzzy FCA input quantitative data in the form of fuzzy relations, and output qualitative
concept lattices in the standard form. The fact that numeric input data are reduced to the usual lattice outputs can be
viewed as an advantage, since the outputs can then be presented, and interpreted, using the available FCA visualization
tools and methods. On the other hand, only a limited amount of information contained in a numeric data set can be
effectively captured in lattice displays. The practices of spectral methods of concept analysis [1, 10, 22], pervasive
in web commerce, show that the quantitative information received in the input contexts can often be preserved in the
output concepts, and effectively used in ongoing analyses. Our work has been motivated by the idea that suitably
refined FCA constructions could output concept structures with useful quantitative information, akin to the concept
eigenspaces of LSA. It turns out that the steps towards quantitative concepts on the FCA side have previously been
made by Belohlavek in [4], where fuzzy concept lattices derived from fuzzy contexts were proposed and analyzed.
This is the mentioned notable exception from the other fuzzy and quantitative approaches to FCA, which all derive just
qualitative concept lattices from quantitative contexts. Belohlavek's basic definitions turn out to be remarkably close to
the definitions we start from in the present paper, in spite of the fact that his goal is to generalize FCA using carefully
chosen fuzzy structures, whereas we use enriched categories with the ultimate goal to align FCA with the spectral
methods for concept analysis, such as LSA. Does this confirm that the structures obtained in both cases naturally arise
from the shared FCA foundations, rather than from either the fuzzy or the categorical approach? The ensuing analyses,
however, shed light on these structures from essentially different angles, and open up complementary views: while
Belohlavek provides a detailed analysis of the internal structure of fuzzy concept lattices, we provide a high level view
of their universal properties, from which some internal features follow, and which offers guidance through the maze
of the available structural choices. Combining the two methods seems to open interesting avenues for future work.

Our motivating example suggests that our goals might be related to those of [11], where an FCA approach to
recommender systems was proposed. However, the authors of [11] use FCA to tackle the problem of partial information
(the missing ratings) in recommender systems, and they abstract away the quantitative information (contained in the
available ratings); whereas our goal is to capture this quantitative information, and we leave the problem of partial
information aside for the moment.

Outline of the paper. In Sec. 2 we introduce proximity sets (proxets), the mathematical formalism supporting
the proposed generalization of FCA. Some constructions and notations used throughout the paper are introduced
in Sec. 2.2. Since proxets generalize posets, in Sec. 3 we introduce the corresponding generalizations of infimum and
supremum, and spell out the basic completion constructions, and the main properties of the infimum (resp. supremum)
preserving morphisms. In Sec. 4 we study context matrices over proximity sets, and describe their decomposition,
with a universal property analogous to the Singular Value Decomposition of matrices in linear algebra. Restricting this
decomposition from proxets to discrete posets (i.e. sets) yields FCA. The drawback of this quantitative version of FCA
is that in it a finite context generally allows an infinite proxet of concepts, whereas in the standard version of FCA, of
course, finite contexts lead to finite concept lattices. This problem is tackled in Sec. 5, where we show how the users
and the items, as related in the context, induce a finite generating set of concepts. Sec. 6 provides a discussion of the
obtained results and ideas for future work.
2 Proxets



2.1 Definition, intuition, examples 

Notation. Throughout the paper, the order and lattice structure of the interval [0, 1] are denoted by $\le$, $\wedge$ and $\vee$,
whereas $\cdot$ denotes the multiplication in it.

Definition 2.1 A proximity over a set A is a map $(\vdash) : A \times A \to [0,1]$ which is

• reflexive: $(x \vdash x) = 1$,

• transitive: $(x \vdash y) \cdot (y \vdash z) \le (x \vdash z)$, and

• antisymmetric: $(x \vdash y) = 1 = (y \vdash x) \implies x = y$.

If only reflexivity and transitivity are satisfied, and not antisymmetry, then we have an intensional proximity map. The
antisymmetry condition is sometimes called extensionality. A(n intensional) proximity set, or proxet, is a set equipped
with a(n intensional) proximity map. A proximity (or monotone) morphism between the proxets A and B is a function
$f : A \to B$ such that all $x, y \in A$ satisfy $(x \vdash y)_A \le (fx \vdash fy)_B$. We denote by Prox the category of proxets and their
morphisms.
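
As a minimal sketch of Definition 2.1, assuming a finite carrier set and exact rational values, the three axioms and the morphism condition can be checked by brute force. The names `is_proximity`, `is_morphism`, and the two small tables are hypothetical and serve only as illustration.

```python
from fractions import Fraction
from itertools import product

ONE = Fraction(1)

def is_proximity(points, prox, extensional=True):
    """Check reflexivity, transitivity and (optionally) antisymmetry."""
    for x in points:
        if prox(x, x) != ONE:
            return False
    for x, y, z in product(points, repeat=3):
        if prox(x, y) * prox(y, z) > prox(x, z):
            return False
    if extensional:
        for x, y in product(points, repeat=2):
            if x != y and prox(x, y) == ONE and prox(y, x) == ONE:
                return False
    return True

def is_morphism(points, prox_a, prox_b, f):
    """Check (x |- y)_A <= (fx |- fy)_B for all pairs of points of A."""
    return all(prox_a(x, y) <= prox_b(f(x), f(y))
               for x, y in product(points, repeat=2))

# Hypothetical two-point proxet: p is closer to q than q is to p.
TABLE = {("p", "p"): ONE, ("p", "q"): Fraction(1, 2),
         ("q", "p"): Fraction(1, 4), ("q", "q"): ONE}

def good(x, y):
    return TABLE[(x, y)]

# Violates transitivity: (x |- y) * (y |- z) = 1, but (x |- z) = 1/2.
BAD = {(a, b): ONE for a in "xyz" for b in "xyz"}
BAD[("x", "z")] = Fraction(1, 2)

def bad(x, y):
    return BAD[(x, y)]

print(is_proximity(["p", "q"], good))
print(is_proximity(list("xyz"), bad))
```

Passing `extensional=False` checks only the intensional axioms.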

Categorical view. A categorically minded reader can understand intensional proxets as categories enriched [21] over
the poset [0, 1] viewed as a category, with the monoidal structure induced by the multiplication. In the presence of
reflexivity and transitivity, $(x \vdash y) = 1$ is equivalent with $\forall z.\ (z \vdash x) \le (z \vdash y)$, and with $\forall z.\ (x \vdash z) \ge (y \vdash z)$. A proximity
map is thus antisymmetric if and only if $(\forall z.\ (z \vdash x) = (z \vdash y)) \implies x = y$, and if and only if $(\forall z.\ (x \vdash z) = (y \vdash z)) \implies x = y$.
This means that extensional proxets correspond to skeletal [0, 1]-enriched categories.

2.1.1 Examples. 

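The first example of a proxet is the interval [0, 1] itself, with the proximity

\[ (x \vdash y)_{[0,1]} = \begin{cases} 1 & \text{if } x \le y \\ y/x & \text{otherwise} \end{cases} \qquad (1) \]

Note that $(\vdash) : [0,1] \times [0,1] \to [0,1]$ is now an operation on [0,1], satisfying

\[ (x \cdot y) \le z \iff x \le (y \vdash z) \qquad (2) \]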
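
The adjunction law (2) can be checked mechanically. The following is a small sketch, assuming exact rational arithmetic on a finite grid of points in [0, 1]; the function name `hom` and the grid are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def hom(x, y):
    """(x |- y) on [0, 1]: 1 if x <= y, otherwise the ratio y / x."""
    return Fraction(1) if x <= y else Fraction(y) / Fraction(x)

# Grid of sample points 0, 1/8, ..., 1.
grid = [Fraction(k, 8) for k in range(9)]

# Law (2): x * y <= z  iff  x <= (y |- z), tested on every triple of the grid.
adjunction_holds = all((x * y <= z) == (x <= hom(y, z))
                       for x, y, z in product(grid, repeat=3))
print(adjunction_holds)
```

The division in `hom` is only reached when x > y, so x is nonzero there.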

A wide family of examples follows from the fact that proximity sets (proxets) generalize partially ordered sets
(posets), in the sense that any poset S can be viewed as a proxet WS, with the proximity induced by the partial
ordering $\sqsubseteq_S$ as follows:

\[ (x \vdash y)_{WS} = \begin{cases} 1 & \text{if } x \sqsubseteq_S y \\ 0 & \text{otherwise} \end{cases} \qquad (3) \]

The proxet WS is intensional if and only if S is just a preorder, in the sense that the relation $\sqsubseteq$ is just transitive and
reflexive. The other way around, any (intensional) proxet A induces two posets (resp. preorders), $\Upsilon A$ and $\Lambda A$, with the
same underlying set and

\[ x \sqsubseteq_{\Upsilon A} y \iff (x \vdash y)_A = 1 \qquad\qquad x \sqsubseteq_{\Lambda A} y \iff (x \vdash y)_A > 0 \]

Since the constructions W, $\Upsilon$ and $\Lambda$, extended on maps, preserve monotonicity, a categorically minded reader can
easily confirm that we have three functors, which happen to form two adjunctions $\Lambda \dashv W \dashv \Upsilon : \mathsf{Prox} \to \mathsf{Pos}$. Since
$W : \mathsf{Pos} \hookrightarrow \mathsf{Prox}$ is an embedding, Pos is thus a reflective and coreflective subcategory of Prox. This means that
$\Lambda WS = S = \Upsilon WS$ holds for every poset S, so that posets are exactly the proxets where the proximities are evaluated
only in 0 or 1; and that $\Lambda A$ and $\Upsilon A$ are respectively the initial and the final poset induced by the proxet A, as witnessed
by the obvious morphisms $W\Upsilon A \to A \to W\Lambda A$. The same universal properties extend to a correspondence between
intensional proxets and preorders.

A different family of examples is induced by metric spaces: any metric space X with a distance map $d : X \times X \to [0, \infty]$
can be viewed as a proxet with the proximity map

\[ (x \vdash y) = 2^{-d(x,y)} \qquad (4) \]

Proxets are thus a common generalization of posets and metric spaces. But the usual metric distances are symmetric,
i.e. satisfy $d(x,y) = d(y,x)$, whereas the proximities need not be. The inverse of (4) maps any proximity to a
quasi-metric $d(x,y) = -\log (x \vdash y)$ fS^l, whereas intensional proximities induce pseudo-quasi-metrics \TT\. For a concrete
family of examples of quasi-metrics, take any family of sets $\mathcal{X} \subseteq \wp X$, and define

\[ d(x,y) = |y \setminus x| \]

The distance of x and y is thus the number of elements of y that are not in x. This induces the proximity $(x \vdash y) = 2^{-|y \setminus x|}$.
If $\mathcal{X}$ is a set of documents, viewed as bags (multisets) of terms, then both constructions can be generalized to count
the difference in the numbers of the occurrences of terms in documents, and the set difference becomes multiset
subtraction.
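
The set-difference quasi-metric and its multiset variant can be sketched as follows. The function names and the two sample documents are hypothetical; the multiset version relies on `collections.Counter` subtraction, which keeps only positive counts.

```python
from fractions import Fraction
from collections import Counter

def distance(x, y):
    """d(x, y): the number of elements of y that are not in x."""
    return len(set(y) - set(x))

def proximity(x, y):
    """(x |- y) = 2 ** (-d(x, y)), kept exact as a fraction."""
    return Fraction(1, 2 ** distance(x, y))

def bag_distance(x, y):
    """Multiset variant: occurrences in y not matched by occurrences in x."""
    return sum((Counter(y) - Counter(x)).values())

doc1 = {"concept", "lattice"}
doc2 = {"concept", "lattice", "proximity"}

print(proximity(doc1, doc2), proximity(doc2, doc1))
```

The two printed values differ, which illustrates that this proximity is not symmetric: doc2 contains everything in doc1, but not the other way around.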

Proximity or distance? The isomorphism $-\log x : [0,1] \rightleftarrows [0,\infty] : 2^{-x}$ is easily seen to lift to an isomorphism
between the category of proxets, as categories enriched over the multiplicative monoid [0, 1], and the category of
generalized metric spaces, as categories enriched over the additive monoid $[0, \infty]$. Categorical studies of generalized
metric spaces were initiated in [25], continued in denotational semantics of programming languages [34, 6, 24], and
have recently turned out to be useful for quantitative distinctions in ecology [27]. The technical results of this paper
could equivalently be stated in the framework of generalized metric spaces. While this would have an advantage of 
familiarity to certain communities, the geometric intuitions that come with metrics turn out to be misleading when 
imposed on the applications that are of interest here. The lifting of infima and suprema is fairly easy from posets 
to proxets, but leads to mysterious looking operations over metrics. In any case, the universal properties of matrix 
decompositions do not seem to have been studied in either framework so far. 

2.2 Derived proxets and notations 

Any proxets A, B give rise to other proxets by following standard constructions:

• the dual (or opposite) proxet $\overline{A}$, with the same underlying set and the proximity $(x \vdash y)_{\overline{A}} = (y \vdash x)_A$;

• the product proxet $A \times B$ over the cartesian product of the underlying sets, and the proximity
$(x,u \vdash y,v)_{A \times B} = (x \vdash y)_A \wedge (u \vdash v)_B$;

• the power proxet $B^A$ over the monotone maps, i.e. Prox(A, B) as the underlying set, with the proximity
$(f \vdash g)_{B^A} = \bigwedge_{x \in A} (fx \vdash gx)_B$.

There are natural correspondences of proxet morphisms

\[ \mathsf{Prox}(A,B) \times \mathsf{Prox}(A,C) \cong \mathsf{Prox}(A, B \times C) \qquad\text{and}\qquad \mathsf{Prox}(A \times B, C) \cong \mathsf{Prox}(A, C^B) \]

Notations. In any proxet A, it is often convenient to abbreviate $(x \vdash y)_A = 1$ to $x \le_A y$. For $f, g : A \to B$, it is easy to
see that $f \le_{B^A} g$ if and only if $fx \le_B gx$ for all $x \in A$.
3 Vectors, limits, adjunctions

3.1 Upper and lower vectors 

Having generalized posets to proxets, we proceed to lift the concepts of the least upper bound and the greatest lower
bound. Let $(S, \sqsubseteq)$ be a poset and let $L, U \subseteq S$ be a lower set and an upper set, respectively, in the sense that

\[ (x \sqsubseteq y \text{ and } y \in L) \implies x \in L \qquad\qquad (x \in U \text{ and } x \sqsubseteq y) \implies y \in U \]

Then an element denoted $\bigsqcup L$ is the supremum of L, and $\bigsqcap U$ is the infimum of U, if all $x, y \in S$ satisfy

\[ \bigsqcup L \sqsubseteq y \iff \forall x.\ (x \in L \implies x \sqsubseteq y) \qquad (5) \]
\[ x \sqsubseteq \bigsqcap U \iff \forall y.\ (y \in U \implies x \sqsubseteq y) \qquad (6) \]

We generalize these definitions to proxet limits in (7-8). To generalize the lower sets, over which the suprema are
taken, and the upper sets for infima, observe that any upper set $U \subseteq S$ corresponds to a monotone map $U : S \to \{0,1\}$,
whereas every lower set L corresponds to an antitone map on S, i.e. a monotone map $L : \overline{S} \to \{0,1\}$, where $\overline{S}$ is the
dual proxet defined in Sec. 2.2.

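Definition 3.1 An upper and a lower vector in a proxet A are the monotone maps $\overrightarrow{\upsilon} : A \to [0,1]$ and $\overleftarrow{\lambda} : \overline{A} \to [0,1]$.

The sets of vectors $\Uparrow A = [0,1]^{A}$ and $\Downarrow A = [0,1]^{\overline{A}}$ form proxets, with the proximities computed in terms of the infima
in [0, 1], as

\[ (\overrightarrow{\upsilon} \vdash \overrightarrow{\tau})_{\Uparrow A} = \bigwedge_{x \in A} (\tau_x \vdash \upsilon_x) \qquad\qquad (\overleftarrow{\lambda} \vdash \overleftarrow{\mu})_{\Downarrow A} = \bigwedge_{x \in A} (\lambda_x \vdash \mu_x) \]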

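Remark. Note that the defining condition for upper vectors $(x \vdash y) \le (\upsilon_x \vdash \upsilon_y)$, and the defining condition for
lower vectors $(x \vdash y) \le (\lambda_y \vdash \lambda_x)$ are respectively equivalent with

\[ \upsilon_x \cdot (x \vdash y) \le \upsilon_y \qquad\text{and}\qquad (x \vdash y) \cdot \lambda_y \le \lambda_x \]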



3.2 Limits 



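Definition 3.2 The upper limit or supremum $\coprod \overleftarrow{\lambda}$ of the lower vector $\overleftarrow{\lambda}$ and the lower limit or infimum $\prod \overrightarrow{\upsilon}$ of the
upper vector $\overrightarrow{\upsilon}$ are the elements of A that satisfy for every $x, y \in A$

\[ \Big(\coprod \overleftarrow{\lambda} \vdash y\Big)_A = \bigwedge_{x \in A} \lambda_x \vdash (x \vdash y)_A \qquad (7) \]
\[ \Big(x \vdash \prod \overrightarrow{\upsilon}\Big)_A = \bigwedge_{y \in A} \upsilon_y \vdash (x \vdash y)_A \qquad (8) \]

The proxet A is complete under infima (resp. suprema) if every upper (resp. lower) vector has an infimum (resp.
supremum), which thus yield the operations $\prod : \Uparrow A \to A$ and $\coprod : \Downarrow A \to A$.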

Remarks. Condition (7) generalizes (5), whereas (8) generalizes (6). Note how the proximity operation $\vdash$ over [0,1],
defined in (1), plays in (7-8) the role that the implication $\Rightarrow$ over {0, 1} played in (5-6). This is justified by the fact
that $\vdash$ is adjoint to the multiplication in [0, 1], in the sense of (2), in the same sense in which $\Rightarrow$ is adjoint to the meet
in {0, 1}, or in any Heyting algebra, in the sense of $(x \wedge y) \le z \iff x \le (y \Rightarrow z)$.

An element w of a poset S is an upper bound of $L \subseteq S$ if it satisfies just one direction of (5), i.e.

\[ (w \sqsubseteq y) \implies \forall x.\ (x \in L \implies x \sqsubseteq y) \]

Ditto for the lower bounds. In a proxet A, u is an upper bound of $\overleftarrow{\lambda}$ and $\ell$ is a lower bound of $\overrightarrow{\upsilon}$ if all $x, y \in A$ satisfy

\[ (u \vdash y)_A \le \bigwedge_{x \in A} \lambda_x \vdash (x \vdash y)_A \qquad\text{and}\qquad (x \vdash \ell)_A \le \bigwedge_{y \in A} \upsilon_y \vdash (x \vdash y)_A \]

Using (2) and instantiating y to u in the first inequality, and x to $\ell$ in the second one, these conditions can be shown to
be equivalent with $\lambda_x \le (x \vdash u)_A$ and $\upsilon_y \le (\ell \vdash y)_A$, which characterize the upper and the lower bounds in proxets.

3.3 Completions 

Each element a of a proxet A induces two representable vectors

\[ \Delta a : A \to [0,1],\quad x \mapsto (a \vdash x)_A \qquad\qquad \nabla a : \overline{A} \to [0,1],\quad x \mapsto (x \vdash a)_A \]

It is easy to see that these maps induce proximity morphisms $\Delta : A \to \Uparrow A$ and $\nabla : A \to \Downarrow A$, which correspond to the
categorical Yoneda embeddings [29, Sec. III.2]. They make $\Uparrow A$ into the lower completion, and $\Downarrow A$ into the upper
completion of the proxet A.

Proposition 3.3 $\Uparrow A$ is upper complete and $\Downarrow A$ is lower complete. Moreover, they are universal, in the sense that

• any monotone $f : A \to C$ into a complete proxet C induces a unique $\prod$-preserving morphism $f_\# : \Uparrow A \to C$ such
that $f = f_\# \circ \Delta$;

• any monotone $g : A \to D$ into a cocomplete proxet D induces a unique $\coprod$-preserving morphism $g^\# : \Downarrow A \to D$
such that $g = g^\# \circ \nabla$.

3.4 Adjunctions 

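Proposition 3.4 For any proximity morphism $f : A \to B$ holds (a) $\iff$ (b) $\iff$ (c) and (d) $\iff$ (e) $\iff$ (f),
where

(a) $f(\coprod \overleftarrow{\lambda}) = \coprod f(\overleftarrow{\lambda})$

(b) $\exists f_* : B \to A\ \ \forall x \in A\ \forall y \in B.\ (fx \vdash y)_B = (x \vdash f_* y)_A$

(c) $\exists f_* : B \to A.\ \mathrm{id}_A \le f_* f \ \wedge\ f f_* \le \mathrm{id}_B$

(d) $f(\prod \overrightarrow{\upsilon}) = \prod f(\overrightarrow{\upsilon})$

(e) $\exists f^* : B \to A\ \ \forall x \in A\ \forall y \in B.\ (f^* y \vdash x)_A = (y \vdash fx)_B$

(f) $\exists f^* : B \to A.\ f^* f \le \mathrm{id}_A \ \wedge\ \mathrm{id}_B \le f f^*$

The morphisms $f^*$ and $f_*$ are unique, whenever they exist.

Definition 3.5 An upper adjoint is a proximity morphism satisfying (a-c) of Prop. 3.4; a lower adjoint satisfies (d-f).
A (proximity) adjunction between proxets A and B is a pair of proximity morphisms $f^* : A \rightleftarrows B : f_*$ related as in (b-c)
and (e-f).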
3.5 Projectors and nuclei

Proposition 3.6 For any adjunction $f^* : A \rightleftarrows B : f_*$ holds (a) $\iff$ (b) and (c) $\iff$ (d), where

(a) $\forall x, y \in B.\ (f_* x \vdash f_* y)_A = (x \vdash y)_B$

(b) $f^* f_* = \mathrm{id}_B$

(c) $\forall x, y \in A.\ (f^* x \vdash f^* y)_B = (x \vdash y)_A$

(d) $f_* f^* = \mathrm{id}_A$

Definition 3.7 An adjunction satisfying (a-b) of Prop. 3.6 is an upper projector; an adjunction satisfying (c-d) is a
lower projector. The upper (resp. lower) component of an upper (resp. lower) projector is called the upper (lower)
projection. The other component (i.e. the one in (a), resp. (c)) is called the upper (lower) embedding.

Proposition 3.8 Any upper (lower) adjoint factors, uniquely up to isomorphism, through an upper (lower) projection
followed by an upper (lower) embedding through the proxet

\[ \llbracket f \rrbracket = \{ (x,y) \in A \times B \mid f^* x = y \ \wedge\ x = f_* y \} \]

Definition 3.9 A nucleus of the adjunction $f = (f^* : A \rightleftarrows B : f_*)$ consists of a proxet $\llbracket f \rrbracket$ together with

• embeddings $A \xhookleftarrow{e_*} \llbracket f \rrbracket \xhookrightarrow{e^*} B$

• projections $A \xrightarrow{p^*} \llbracket f \rrbracket \xleftarrow{p_*} B$

such that $f^* = e^* p^*$ and $f_* = e_* p_*$.

3.6 Cones and cuts 

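The cone operations are the proximity morphisms $\Delta^\# : \Downarrow A \to \Uparrow A$ and $\nabla_\# : \Uparrow A \to \Downarrow A$.

These morphisms are induced by the universal properties of the Yoneda embeddings $\nabla$ and $\Delta$ as completions, stated
in Prop. 3.3. Since by definition $\Delta^\#$ preserves suprema, and $\nabla_\#$ preserves infima, Prop. 3.4 implies that each of them
is an adjoint, and it is not hard to see that they form the adjunction $\Delta^\# : \Downarrow A \rightleftarrows \Uparrow A : \nabla_\#$. Spelling them out yields

\[ (\Delta^\# \overleftarrow{\lambda})_a = \bigwedge_{x \in A} \lambda_x \vdash (x \vdash a) \qquad\qquad (\nabla_\# \overrightarrow{\upsilon})_a = \bigwedge_{x \in A} \upsilon_x \vdash (a \vdash x) \]

Intuitively, $(\Delta^\# \overleftarrow{\lambda})_a$ is the proximity of $\overleftarrow{\lambda}$ to a as its upper bound, as discussed in Sec. 3.2. Visually, $(\Delta^\# \overleftarrow{\lambda})_a$ thus
measures the cone from $\overleftarrow{\lambda}$ to a, whereas $(\nabla_\# \overrightarrow{\upsilon})_a$ measures the cone from a to $\overrightarrow{\upsilon}$.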

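Proposition 3.10 For every $\overleftarrow{\lambda} \in \Downarrow A$ and every $\overrightarrow{\upsilon} \in \Uparrow A$ holds

\[ \overleftarrow{\lambda} \le \nabla_\# \Delta^\# \overleftarrow{\lambda} \qquad\text{and}\qquad \overleftarrow{\lambda} \ge \nabla_\# \Delta^\# \overleftarrow{\lambda} \iff \exists \overrightarrow{\upsilon} \in \Uparrow A.\ \overleftarrow{\lambda} = \nabla_\# \overrightarrow{\upsilon} \]
\[ \overrightarrow{\upsilon} \le \Delta^\# \nabla_\# \overrightarrow{\upsilon} \qquad\text{and}\qquad \overrightarrow{\upsilon} \ge \Delta^\# \nabla_\# \overrightarrow{\upsilon} \iff \exists \overleftarrow{\lambda} \in \Downarrow A.\ \overrightarrow{\upsilon} = \Delta^\# \overleftarrow{\lambda} \]

The transpositions make the following subproxets isomorphic

\[ (\Downarrow A)_{\nabla_\# \Delta^\#} = \{ \overleftarrow{\lambda} \in \Downarrow A \mid \overleftarrow{\lambda} = \nabla_\# \Delta^\# \overleftarrow{\lambda} \} \qquad\qquad (\Uparrow A)_{\Delta^\# \nabla_\#} = \{ \overrightarrow{\upsilon} \in \Uparrow A \mid \overrightarrow{\upsilon} = \Delta^\# \nabla_\# \overrightarrow{\upsilon} \} \]

Definition 3.11 A pair of vectors $\overleftarrow{\gamma} \in (\Downarrow A)_{\nabla_\# \Delta^\#}$ and $\overrightarrow{\gamma} \in (\Uparrow A)_{\Delta^\# \nabla_\#}$ such that $\overleftarrow{\gamma} = \nabla_\# \overrightarrow{\gamma}$ and $\overrightarrow{\gamma} = \Delta^\# \overleftarrow{\gamma}$ is
a cut $\gamma = (\overleftarrow{\gamma}, \overrightarrow{\gamma})$ in proxet A. Cuts form a proxet $\Updownarrow A$, isomorphic
with $(\Downarrow A)_{\nabla_\# \Delta^\#}$ and $(\Uparrow A)_{\Delta^\# \nabla_\#}$, with the proximity

\[ (\gamma \vdash \varphi)_{\Updownarrow A} = (\overleftarrow{\gamma} \vdash \overleftarrow{\varphi})_{\Downarrow A} = (\overrightarrow{\gamma} \vdash \overrightarrow{\varphi})_{\Uparrow A} \]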

Lemma 3.12 The $\Updownarrow A$-infima are constructed in $\Downarrow A$, and $\Updownarrow A$-suprema are constructed in $\Uparrow A$.

Corollary 3.13 A proxet A has all suprema if and only if it has all infima.

Dedekind-MacNeille completion is a special case. If A is a poset, viewed by (3) as the proxet WA, then $\Updownarrow WA$ is
the Dedekind-MacNeille completion of A [28]. The above construction extends the Dedekind-MacNeille completion
to the more general framework of proxets, in the sense that it satisfies the universal property of the
Dedekind-MacNeille completion [2]. The construction seems to be novel in the familiar frameworks of metric and quasi-metric
spaces. However, Quantitative Concept Analysis requires that we lift this construction to matrices.

4 Proximity matrices and their decomposition 
4.1 Definitions, connections 

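Definition 4.1 A proximity matrix $\Phi$ from proxet A to proxet B is a vector $\Phi : \overline{A} \times B \to [0,1]$. We write it as
$\Phi : A \looparrowright B$, and write its value $\Phi(x,y)$ at $x \in A$ and $y \in B$ in the form $(x \models y)_\Phi$. The matrix composition of $\Phi : A \looparrowright B$
and $\Psi : B \looparrowright C$ is defined

\[ (x \models z)_{(\Phi ; \Psi)} = \bigvee_{y \in B} (x \models y)_\Phi \cdot (y \models z)_\Psi \]

With this composition and the identity matrices $\mathrm{Id}_A : \overline{A} \times A \to [0,1]$ where $\mathrm{Id}_A(x, x') = (x \vdash x')_A$, proxets and proxet
matrices form the category Matr.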
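
For finite proxets the supremum in the composition is a maximum, so composition becomes a max-times matrix product. The sketch below assumes discrete proxets (where any table of values in [0, 1] is a proximity matrix and the identity matrix is 1 on the diagonal and 0 elsewhere); the dictionary encoding and all names are illustrative.

```python
from fractions import Fraction as F

def compose(phi, psi, A, B, C):
    """(x |= z) = max over y in B of (x |= y)_phi * (y |= z)_psi."""
    return {(x, z): max(phi[(x, y)] * psi[(y, z)] for y in B)
            for x in A for z in C}

def identity(A):
    """Identity matrix of a discrete proxet: (x |- x') is 1 if x == x', else 0."""
    return {(x, y): F(int(x == y)) for x in A for y in A}

A, B, C = ["a1", "a2"], ["b1", "b2"], ["c1"]
phi = {("a1", "b1"): F(1, 2), ("a1", "b2"): F(1),
       ("a2", "b1"): F(1, 4), ("a2", "b2"): F(0)}
psi = {("b1", "c1"): F(1), ("b2", "c1"): F(1, 3)}

composed = compose(phi, psi, A, B, C)
print(composed)
```

Composing with `identity(A)` on the left returns `phi` unchanged, as the unit law of the category requires.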

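Remark. Note that the defining condition $(u \vdash x) \cdot (y \vdash v) \le \big( (x \models y)_\Phi \vdash (u \models v)_\Phi \big)$, which says that $\Phi$ is a proximity
morphism $\overline{A} \times B \to [0,1]$, can be equivalently written

\[ (u \vdash x) \cdot (x \models y)_\Phi \cdot (y \vdash v) \le (u \models v)_\Phi \qquad (9) \]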



Definition 4.2 The dual $\Phi^\ddagger : B \looparrowright A$ of a matrix $\Phi : A \looparrowright B$ has the entries




ueA 
veB 



A matrix $\Phi : A \looparrowright B$ where $\Phi^{\ddagger\ddagger} = \Phi$ is called a suspension.
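Remarks. It is easy to see by Prop. 3.10 that $(x \models y)_\Phi \le (x \models y)_{\Phi^{\ddagger\ddagger}}$ holds for all $x \in A$ and $y \in B$, and that $\Phi$ is a
suspension if and only if there is some $\Psi : B \looparrowright A$ such that $\Phi = \Psi^\ddagger$. It is easy to see that $\Phi \le \Psi \implies \Phi^\ddagger \ge \Psi^\ddagger$, and
thus $\Phi \le \Phi^{\ddagger\ddagger}$ implies $\Phi^\ddagger = \Phi^{\ddagger\ddagger\ddagger}$.

Definition 4.3 The matrices $\Phi : A \looparrowright B$ and $\Psi : B \looparrowright A$ form a connection if
$\Phi ; \Psi \le \mathrm{Id}_A$ and $\Psi ; \Phi \le \mathrm{Id}_B$.

Proposition 4.4 $\Phi : A \looparrowright B$ and $\Phi^\ddagger : B \looparrowright A$ always form a connection.

Definition 4.5 A matrix $\Phi : A \looparrowright B$ is an embedding if $\Phi ; \Phi^\ddagger = \mathrm{Id}_A$; and a projection if $\Phi^\ddagger ; \Phi = \mathrm{Id}_B$.

Definition 4.6 A decomposition of a matrix $\Phi : A \looparrowright B$ consists of a proxet D, with

• projection matrix $P : A \looparrowright D$, i.e. $(d \vdash d')_D = \bigvee_{x \in A} (d \models x)_{P^\ddagger} \cdot (x \models d')_P$,

• embedding matrix $E : D \looparrowright B$, i.e. $(d \vdash d')_D = \bigvee_{y \in B} (d \models y)_E \cdot (y \models d')_{E^\ddagger}$,

such that $\Phi = P ; E$, i.e. $(x \models y)_\Phi = \bigvee_{d \in D} (x \models d)_P \cdot (d \models y)_E$.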



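Matrices as adjunctions. A matrix $\Phi : A \looparrowright B$ can be equivalently presented as either of the proximity morphisms
$\Phi_\bullet$ and $\Phi^\bullet$, which extend to $\Phi^*$ and $\Phi_*$ using Prop. 3.3:

\[ \Phi : \overline{A} \times B \to [0,1] \]
\[ \Phi_\bullet : A \to \Uparrow B \qquad\qquad \Phi^\bullet : B \to \Downarrow A \]
\[ \Phi^* : \Downarrow A \to \Uparrow B \qquad\qquad \Phi_* : \Uparrow B \to \Downarrow A \]
\[ (\Phi^* \overleftarrow{\alpha})_b = \bigwedge_{x \in A} \alpha_x \vdash (x \models b)_\Phi \qquad\qquad (\Phi_* \overrightarrow{\beta})_a = \bigwedge_{y \in B} \beta_y \vdash (a \models y)_\Phi \qquad (10) \]

Both extensions, and their nucleus, are summarized in diagram (11).

[Diagram (11): the extensions $\Phi^*$, $\Phi_*$ of $\Phi$ along the Yoneda embeddings, and their nucleus]

The adjunction $\Phi^* : \Downarrow A \rightleftarrows \Uparrow B : \Phi_*$ means that

\[ (\Phi^* \overleftarrow{\alpha} \vdash \overrightarrow{\beta})_{\Uparrow B} = (\overleftarrow{\alpha} \vdash \Phi_* \overrightarrow{\beta})_{\Downarrow A} \]

holds. The other way around, it can be shown that any adjunction between $\Downarrow A$ and $\Uparrow B$ is completely determined by
the induced matrix from A to B.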

Proposition 4.7 The matrices $\Phi \in \mathsf{Matr}(A, B)$ are in a bijective correspondence with the adjunctions $\Phi^* : \Downarrow A \rightleftarrows \Uparrow B : \Phi_*$.

4.2 Matrix decomposition through nucleus

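Prop. 3.10 readily lifts to matrices.

Proposition 4.8 For every $\overleftarrow{\alpha} \in \Downarrow A$ and every $\overrightarrow{\beta} \in \Uparrow B$ holds

\[ \overleftarrow{\alpha} \le \Phi_* \Phi^* \overleftarrow{\alpha} \qquad\text{and}\qquad \overleftarrow{\alpha} \ge \Phi_* \Phi^* \overleftarrow{\alpha} \iff \exists \overrightarrow{\beta} \in \Uparrow B.\ \overleftarrow{\alpha} = \Phi_* \overrightarrow{\beta} \]
\[ \overrightarrow{\beta} \le \Phi^* \Phi_* \overrightarrow{\beta} \qquad\text{and}\qquad \overrightarrow{\beta} \ge \Phi^* \Phi_* \overrightarrow{\beta} \iff \exists \overleftarrow{\alpha} \in \Downarrow A.\ \overrightarrow{\beta} = \Phi^* \overleftarrow{\alpha} \]

The adjunction $\Phi^* : \Downarrow A \rightleftarrows \Uparrow B : \Phi_*$ induces the isomorphisms between the following proxets

\[ \llbracket \Phi \rrbracket_A = \{ \overleftarrow{\alpha} \in \Downarrow A \mid \overleftarrow{\alpha} = \Phi_* \Phi^* \overleftarrow{\alpha} \} \]
\[ \llbracket \Phi \rrbracket_B = \{ \overrightarrow{\beta} \in \Uparrow B \mid \overrightarrow{\beta} = \Phi^* \Phi_* \overrightarrow{\beta} \} \]
\[ \llbracket \Phi \rrbracket = \{ \gamma = \langle \overleftarrow{\gamma}, \overrightarrow{\gamma} \rangle \in \Downarrow A \times \Uparrow B \mid \overleftarrow{\gamma} = \Phi_* \overrightarrow{\gamma} \ \wedge\ \Phi^* \overleftarrow{\gamma} = \overrightarrow{\gamma} \} \]

with the proximity

\[ (\gamma \vdash \varphi)_{\llbracket \Phi \rrbracket} = (\overleftarrow{\gamma} \vdash \overleftarrow{\varphi})_{\Downarrow A} = (\overrightarrow{\gamma} \vdash \overrightarrow{\varphi})_{\Uparrow B} \]

Definition 4.9 $\llbracket \Phi \rrbracket$ is called the nucleus of the matrix $\Phi$. Its elements are the $\Phi$-cuts.

Theorem 4.10 The matrix $\Phi : A \looparrowright B$ decomposes through $\llbracket \Phi \rrbracket$ into

• the projection $P^* : A \looparrowright \llbracket \Phi \rrbracket$ with $(x \models \langle \overleftarrow{\alpha}, \overrightarrow{\beta} \rangle)_{P^*} = \alpha_x$, and

• the embedding $E^* : \llbracket \Phi \rrbracket \looparrowright B$ with $(\langle \overleftarrow{\alpha}, \overrightarrow{\beta} \rangle \models y)_{E^*} = \beta_y$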
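
To illustrate how a lower vector is closed into a cut, here is a small sketch. It assumes discrete finite proxets A and B (so every table of values is a valid vector or matrix), and it assumes that the two extensions act as $(\Phi^*\alpha)_b = \bigwedge_x \alpha_x \vdash (x \models b)$ and $(\Phi_*\beta)_a = \bigwedge_y \beta_y \vdash (a \models y)$; the matrix values and function names are hypothetical.

```python
from fractions import Fraction as F

def hom(x, y):
    """(x |- y) on [0, 1]: 1 if x <= y, otherwise y / x."""
    return F(1) if x <= y else F(y) / F(x)

def upper(phi, A, B, alpha):
    """Assumed form of Phi^*: min over x in A of alpha_x |- (x |= b)."""
    return {b: min(hom(alpha[x], phi[(x, b)]) for x in A) for b in B}

def lower(phi, A, B, beta):
    """Assumed form of Phi_*: min over y in B of beta_y |- (a |= y)."""
    return {a: min(hom(beta[y], phi[(a, y)]) for y in B) for a in A}

def cut_from(phi, A, B, alpha):
    """Close a lower vector into the pair (Phi_* Phi^* alpha, Phi^* alpha)."""
    beta = upper(phi, A, B, alpha)
    return lower(phi, A, B, beta), beta

A, B = ["x1", "x2"], ["y1", "y2"]
phi = {("x1", "y1"): F(1), ("x1", "y2"): F(1, 2),
       ("x2", "y1"): F(1, 4), ("x2", "y2"): F(1)}

# Representable lower vector of x1 in a discrete proxet: 1 at x1, 0 elsewhere.
nabla_x1 = {"x1": F(1), "x2": F(0)}
left, right = cut_from(phi, A, B, nabla_x1)
print(left, right)
```

Applying `upper` to `left` returns `right` again, and `lower` applied to `right` returns `left`, which is the fixed-point condition that defines a cut.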

4.3 Universal properties 

Any proxet morphism $f : A \to B$ induces two matrices, $\Omega f : A \looparrowright B$ and $\mho f : B \looparrowright A$ with

\[ (x \models y)_{\Omega f} = (fx \vdash y)_B \qquad\qquad (y \models x)_{\mho f} = (y \vdash fx)_B \]

Definition 4.11 A proximity matrix morphism from a matrix $\Phi : F_0 \looparrowright F_1$ to $\Gamma : G_0 \looparrowright G_1$ consists of a pair of
monotone maps $h_0 : F_0 \to G_0$ and $h_1 : F_1 \to G_1$ such that

• $\Omega h_0 ; \Gamma = \Phi ; \Omega h_1$,

• $h_0$ preserves any $\coprod$ that may exist in $F_0$,

• $h_1$ preserves any $\prod$ that may exist in $F_1$.

Let MMat denote the category of proxet matrices and matrix morphisms. Let CMat denote the full subcategory spanned
by proximity matrices between complete proxets.

Proposition 4.12 CMat is reflective in MMat along $\llbracket - \rrbracket : \mathsf{MMat} \rightleftarrows \mathsf{CMat} : U$
Posets and FCA. If A and B are posets, a {0, 1}-valued proxet matrix $\Phi : A \looparrowright B$ can be viewed as a subposet
$\Phi \subseteq A \times B$, lower closed in A and upper closed in B. The adjunction $\Phi^* : \Downarrow A \rightleftarrows \Uparrow B : \Phi_*$ is the Galois connection induced
by $\Phi$, and the posetal nucleus $\llbracket \Phi \rrbracket$ is now the complete lattice such that

• $A \xrightarrow{\nabla} \Downarrow A \twoheadrightarrow \llbracket \Phi \rrbracket$ is $\vee$-generating and $\wedge$-preserving,

• $B \xrightarrow{\Delta} \Uparrow B \twoheadrightarrow \llbracket \Phi \rrbracket$ is $\wedge$-generating and $\vee$-preserving.

When A and B are discrete posets, i.e. with all elements incomparable, then any binary relation $R \subseteq A \times B$ can be viewed
as a proxet matrix between them. Restricting to the vectors that take their values in 0 and 1 yields $\Downarrow A \cong (\wp A, \subseteq)$
and $\Uparrow B \cong (\wp B, \supseteq)$. The concept lattice of FCA then arises from the Galois connection $R^* : \Downarrow A \rightleftarrows \Uparrow B : R_*$ as
the concept lattice $\llbracket R \rrbracket$. Restricted to {0, 1}-valued matrices between discrete sets A and B, Prop. 4.12 thus yields a
universal construction of a lattice $\vee$-generated by A and $\wedge$-generated by B. The FCA concept lattice derived from a
context $\Phi$ is thus its posetal nucleus $\llbracket \Phi \rrbracket$. This universal property is closely related with the methods and results of

mM. 
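
For the {0, 1}-valued case described in the paragraph on posets and FCA, the classical concept lattice of a small binary context can be enumerated directly from the Galois connection. This is a brute-force sketch that closes every subset of objects, so it is only meant for tiny contexts; the context and all names are hypothetical.

```python
from itertools import combinations

def intent(objs, attributes, relation):
    """Attributes shared by every object in objs."""
    return frozenset(m for m in attributes
                     if all((g, m) in relation for g in objs))

def extent(attrs, objects, relation):
    """Objects that have every attribute in attrs."""
    return frozenset(g for g in objects
                     if all((g, m) in relation for m in attrs))

def concepts(objects, attributes, relation):
    """All formal concepts (extent, intent), found by closing each object subset."""
    found = set()
    for size in range(len(objects) + 1):
        for objs in combinations(objects, size):
            shared = intent(objs, attributes, relation)
            found.add((extent(shared, objects, relation), shared))
    return found

objects = [0, 1, 2]
attributes = ["a", "b"]
relation = {(0, "a"), (1, "a"), (1, "b"), (2, "b")}

lattice = concepts(objects, attributes, relation)
for ext, itt in sorted(lattice, key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```

Each printed pair is a fixed point of the two derivation operators, i.e. a cut of the {0, 1}-valued matrix.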

Lifting the Basic Theorem of FCA. The Basic Theorem of FCA says that every complete lattice can be realized as
a concept lattice, namely the one induced by the context of its own partial order. For quantitative concept analysis,
this is an immediate consequence of Prop. 4.12, which implies that a proxet A is complete if and only if $\mathrm{Id}_A = \llbracket \mathrm{Id}_A \rrbracket$.
Intuitively, this just says that the nucleus, as a completion, preserves the structure that it completes, and must therefore
be idempotent, as familiar from the Dedekind-MacNeille construction. It should be noted that this property does not
generalize beyond proxets.



5 Representable concepts and their proximities 
5.1 Decomposition without completion 

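The problem with factoring matrices $\Phi : A \looparrowright B$ through $\llbracket \Phi \rrbracket$ in practice is that $\llbracket \Phi \rrbracket$ is a large, always infinite structure.
The proxet $\llbracket \Phi \rrbracket$ is the completion of the matrix $\Phi : A \looparrowright B$ in the sense that it is

• the subproxet of the $\coprod$-completion $\Downarrow A$ of A, spanned by the vectors $\overleftarrow{\alpha} = \Phi_* \Phi^* \overleftarrow{\alpha}$,

• the subproxet of the $\prod$-completion $\Uparrow B$ of B, spanned by the vectors $\overrightarrow{\beta} = \Phi^* \Phi_* \overrightarrow{\beta}$.

Since there are always uncountably many lower and upper vectors, and the completions $\Downarrow A$ and $\Uparrow B$ are infinite, $\llbracket \Phi \rrbracket$
follows suit. But can we extract a small set of generators of $\llbracket \Phi \rrbracket$, still supporting a decomposition of the matrix $\Phi$?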

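Definition 5.1 The representable concepts induced by $\Phi$ are the elements of the completion $\llbracket \Phi \rrbracket$ induced by the representable
vectors, i.e.

• lower representable concepts $\nabla\Phi = \{ \langle \Phi_* \Phi^* \nabla a,\ \Phi^* \nabla a \rangle \mid a \in A \}$

• upper representable concepts $\Delta\Phi = \{ \langle \Phi_* \Delta b,\ \Phi^* \Phi_* \Delta b \rangle \mid b \in B \}$

• representable concepts $\Diamond\Phi = \nabla\Phi \cup \Delta\Phi$

Notation. The elements of $\Diamond\Phi$ are written in the form $\Diamond x = \langle \overleftarrow{\Diamond x}, \overrightarrow{\Diamond x} \rangle$, and thus

\[ \overleftarrow{\Diamond a} = \Phi_* \Phi^* \nabla a \qquad\qquad \overrightarrow{\Diamond a} = \Phi^* \nabla a \]
\[ \overleftarrow{\Diamond b} = \Phi_* \Delta b \qquad\qquad \overrightarrow{\Diamond b} = \Phi^* \Phi_* \Delta b \]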
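Theorem 5.2 For any proxet matrix $\Phi : A \looparrowright B$, the restriction of the decomposition $A \overset{P^*}{\looparrowright} \llbracket \Phi \rrbracket \overset{E^*}{\looparrowright} B$ from Thm. 4.10
along the inclusion $\Diamond\Phi \hookrightarrow \llbracket \Phi \rrbracket$ to the representable concepts yields a decomposition $A \overset{P}{\looparrowright} \Diamond\Phi \overset{E}{\looparrowright} B$ which still satisfies
Def. 4.6. More precisely, the matrices

• $P : \overline{A} \times \Diamond\Phi \hookrightarrow \overline{A} \times \llbracket \Phi \rrbracket \xrightarrow{P^*} [0,1]$

• $E : \overline{\Diamond\Phi} \times B \hookrightarrow \overline{\llbracket \Phi \rrbracket} \times B \xrightarrow{E^*} [0,1]$

are such that $P : A \looparrowright \Diamond\Phi$ is a projection, $E : \Diamond\Phi \looparrowright B$ is an embedding, and $P ; E = \Phi$.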



5.2 Computing proximities of representable concepts 

To apply these constructions to the ratings matrix from Sec. 1, we first express the star ratings as numbers between
0 and 1.

        n      c      i      b
a      4/5     1     2/5    4/5
d      2/5    2/5    4/5     1
s      2/5     1     3/5    2/5
t      1/5    3/5    3/5    4/5
l       1     1/5    1/5    2/5

where we also abbreviated the user names to U = {a, d, s, t, l} and the item names to J = {n, c, i, b}. Now we can
compute the representable concepts $\Diamond\varphi \in \Diamond\Phi$ according to Def. 5.1, using (10):



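\[ (\overrightarrow{\Diamond u})_j = \bigwedge_{x \in U} (\nabla u)_x \vdash (x \models j) = (u \models j) \qquad\qquad (\overleftarrow{\Diamond j})_u = \bigwedge_{\ell \in J} (\Delta j)_\ell \vdash (u \models \ell) = (u \models j) \]

Since $\overleftarrow{\Diamond\varphi} = \Phi_* \overrightarrow{\Diamond\varphi}$ and $\Phi^* \overleftarrow{\Diamond\varphi} = \overrightarrow{\Diamond\varphi}$, it suffices to compute one component of each pair $\Diamond\varphi = \langle \overleftarrow{\Diamond\varphi}, \overrightarrow{\Diamond\varphi} \rangle$, say the first one.
So we get, with the entries indexed by the users a, d, s, t, l,

\[ \overleftarrow{\Diamond n} = (4/5,\ 2/5,\ 2/5,\ 1/5,\ 1) \qquad\qquad \overleftarrow{\Diamond a} = (1,\ 2/5,\ 1/2,\ 1/4,\ 1/5) \]

The proximities between all representable concepts can now be computed in the form

\[ (x \vdash y)_{\Diamond\Phi} = (\overleftarrow{\Diamond x} \vdash \overleftarrow{\Diamond y})_{\Downarrow U} = \bigwedge_{u \in U} \overleftarrow{\Diamond x}_u \vdash \overleftarrow{\Diamond y}_u \]

e.g.

\[ (a \vdash i)_{\Diamond\Phi} = \bigwedge \Big( (1,\ 2/5,\ 1/2,\ 1/4,\ 1/5) \vdash (2/5,\ 4/5,\ 3/5,\ 3/5,\ 1/5) \Big) = \bigwedge (2/5,\ 1,\ 1,\ 1,\ 1) = 2/5 \]

since the proximity in $\Diamond\Phi$ is just the proximity in $\nabla\Phi$, which is a subproxet of $\Downarrow U$, so its proximity is by Def. 3.1 the
pointwise minimum. Hence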
 ⊢      n      c      i      b      a      d      s      t      l
 n      1     1/5    1/5    2/5    1/5    1/4    1/5    1/3     1
 c     1/3     1     2/5    2/5    5/12   2/5    2/3    1/2    1/3
 i     1/3    1/2     1     2/3    5/12   2/3    1/2    5/6    1/3
 b     1/4    2/5    1/2     1     5/16   5/8    2/5    2/3    1/4
 a     4/5     1     2/5    4/5     1     1/2    2/3    2/3    4/5
 d     2/5    2/5    4/5     1     2/5     1     2/5    2/3    2/5
 s     2/5     1     3/5    2/5    1/2    2/5     1     1/2    2/5
 t     1/5    3/5    3/5    4/5    1/4    1/2    1/2     1     1/5
 l      1     1/5    1/5    2/5    1/5    1/4    1/5    1/3     1

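The bottom five rows of this table display the values of the representable concepts themselves

$$(u \vdash j)_{\Diamond\Phi} \;=\; (u \models j)_{\Phi} \qquad (12)$$

$$(u \vdash v)_{\Diamond\Phi} \;=\; \bigwedge_{\ell \in J} (v \models \ell)_{\Phi} \vdash (u \models \ell)_{\Phi} \qquad (13)$$

for $u, v \in U$ and $j \in J$. The upper four rows display the values

$$(j \vdash k)_{\Diamond\Phi} \;=\; \bigwedge_{x \in U} (x \models j)_{\Phi} \vdash (x \models k)_{\Phi} \qquad (14)$$

$$(j \vdash u)_{\Diamond\Phi} \;=\; \bigwedge_{x \in U} (x \models j)_{\Phi} \vdash (x \vdash u)_{\Diamond\Phi} \;=\; \bigwedge_{\ell \in J} (u \models \ell)_{\Phi} \vdash (j \vdash \ell)_{\Diamond\Phi} \qquad (15)$$

for $u \in U$ and $j, k \in J$. Intuitively, these equations can be interpreted as follows:

• (13): the proximity $(u \vdash v)$ measures how well $(v \models \ell)$ approximates $(u \models \ell)$:

- $u$'s liking $(u \models \ell)$ of any movie $\ell$ is at least $(u \vdash v) \cdot (v \models \ell)$.

• (14): the proximity $(j \vdash k)$ measures how well $(x \models j)$ approximates $(x \models k)$:

- any user $x$'s rating $(x \models k)$ is at least $(x \models j) \cdot (j \vdash k)$.

• (15): the proximity $(j \vdash u)$ measures how well $j$'s style approximates $u$'s taste:

- any $x$'s proximity $(x \vdash u)$ to $u$ is at least $(x \models j) \cdot (j \vdash u)$,

- $j$'s proximity $(j \vdash \ell)$ to any $\ell$ is at least $(j \vdash u) \cdot (u \models \ell)$.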
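
The step from the infima in (13) to (15) to the "at least" statements above is a residuation argument. The following is only a sketch, and it assumes that the proximity on $[0,1]$ is the residual of multiplication; that definition is not restated in this section, so it is an assumption here:

$$z \le (x \vdash y) \iff x \cdot z \le y, \qquad \text{so that} \qquad (x \vdash y) = \begin{cases} 1 & \text{if } x \le y, \\ y/x & \text{otherwise.} \end{cases}$$

Reading (13) as $(u \vdash v) = \bigwedge_{\ell \in J} (v \models \ell) \vdash (u \models \ell)$, each item $\ell$ gives $(u \vdash v) \le (v \models \ell) \vdash (u \models \ell)$, and hence

$$(u \vdash v) \cdot (v \models \ell) \;\le\; (u \models \ell).$$

The inequalities attached to (14) and (15) follow in the same way.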

Since the proximity $(a \vdash l)$ is high, it would make sense for Abby to accept Luka's recommendations, but not the other
way around, since $(l \vdash a)$ is considerably lower. Although Temra's rating of "Ikiru" is just $(t \models i) = \frac{3}{5}$,
"Ikiru" is a good test of her taste, since her rating of it is close to both Dusko's and Stefan's ratings.
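
The proximities described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the proximity on [0,1] is assumed to be the residual of multiplication (1 if x <= y, else y/x), and only the two users with complete rating rows (Abby and Dusko) are used, so that no convention for unrated items is needed. The item values computed from this sub-matrix therefore need not agree with the full proximity table.

```python
from fractions import Fraction as F

# Star ratings of the two users with complete rows, scaled to [0, 1].
ITEMS = ["n", "c", "i", "b"]  # Nemo, Crash, Ikiru, Bladerunner
PHI = {
    "a": dict(zip(ITEMS, [F(4, 5), F(1), F(2, 5), F(4, 5)])),  # Abby
    "d": dict(zip(ITEMS, [F(2, 5), F(2, 5), F(4, 5), F(1)])),  # Dusko
}
USERS = list(PHI)


def hom(x, y):
    """Assumed proximity in [0, 1]: the largest z with x * z <= y."""
    return F(1) if x <= y else y / x


def user_prox(u, v):
    """(u |- v): infimum over items l of (v |= l) |- (u |= l)."""
    return min(hom(PHI[v][l], PHI[u][l]) for l in ITEMS)


def item_prox(j, k):
    """(j |- k): infimum over users x of (x |= j) |- (x |= k)."""
    return min(hom(PHI[x][j], PHI[x][k]) for x in USERS)


def item_user_prox(j, u):
    """(j |- u), first form: infimum over users x of (x |= j) |- (x |- u)."""
    return min(hom(PHI[x][j], user_prox(x, u)) for x in USERS)


def item_user_prox_alt(j, u):
    """(j |- u), second form: infimum over items l of (u |= l) |- (j |- l)."""
    return min(hom(PHI[u][l], item_prox(j, l)) for l in ITEMS)


print(user_prox("a", "d"), user_prox("d", "a"))  # 1/2 2/5
print(item_user_prox("n", "d"))                  # 5/8
```

Exact fractions are used instead of floats so that the two forms of the item-to-user proximity can be compared for equality; on this sub-matrix they coincide for every item and user.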

Latent concepts? While the proximities between each pair of users and items, i.e. between the induced
representable concepts, provide an interesting new view on their relations, the task of determining the latent concepts
remains ahead. What are the dominant tastes around which the users coalesce? What are the dominant styles that
connect the items? What will such concepts look like? Formally, a dominant concept is a highly biased cut: one that is
in high proximity to some of the representable concepts, and distant from the others. One way to find such cuts is to
define the concepts of cohesion and adhesion of a cut along the lines of fSCTl, and solve the corresponding optimization 
problems. Although there is no space to expand the idea in the present paper, some of the latent concepts can be 
recognized already by inspection of the above proximity table (recalling that each cut is both a supremum of users'
and an infimum of items' representations). 
6 Discussion and future work



What has been achieved? We generalized posets to proxets in Sec. 2 and 3, and lifted in Sec. 4 the FCA concept
lattice construction to the corresponding construction over proxets, which allow capturing quantitative information. Both
constructions share the same universal property, captured by the nucleus functor in Sec. 4.3. In both cases, the concepts
are captured by cuts, echoing Dedekind's construction of the reals and MacNeille's minimal completion of a poset.
But while finite contexts yield finite concept lattices in FCA, in our analysis they yield infinitely many quantitative
concepts. This is a consequence of introducing the infinite set of quantities [0,1]. The same phenomenon occurs in
LSA [10], which allows the entire real line of quantities, and where the finite sets of users and items span real vector
spaces that play the same role as our proxet completions. The good news is that the infinite vector space of latent concepts
in LSA comes with a canonical basis of finitely many singular vectors, and that our proxet of latent concepts also has
a finite generator, spelled out in Sec. 5. The bad news is that the generator described there is not a canonical basis of
dominant latent concepts, with the suitable extremal properties, but an ad hoc basis determined by the given sets of
users and items. Due to a lack of space, the final step of the analysis, finding the basis of dominant latent concepts,
had to be left for a future paper. This task can be reduced to some familiar optimization problems.

More interestingly, and perhaps more effectively, this task can also be addressed using qualitative FCA and its concept
scaling methods lil3J . The most effective form of concept analysis may thus very well be a combination of quantitative 
and qualitative analysis tools. Our analysis of the numeric matrix, extracted from the given star ratings, should be 
supplemented by standard FCA analyses of a family of relational contexts scaled by various thresholds. We conjecture 
that the resulting relational concepts will be the projections of the dominant latent concepts arising from quantitative 
analysis. If that is the case, then the relational concepts can be used to guide computation of quantitative concepts. 
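
The idea of scaling the numeric context by thresholds and running standard FCA on each resulting relational context can be sketched as follows. This is a minimal brute-force illustration, not a method taken from the paper: the function names are hypothetical, the star ratings are transcribed from the table in the Introduction, and unrated entries are encoded as 0 stars (an assumption, chosen so that they fail every positive threshold).

```python
from itertools import combinations

USERS = ["Abby", "Dusko", "Stef", "Temra", "Luka"]
ITEMS = ["Nemo", "Crash", "Ikiru", "Bladerunner"]
# Star ratings per user, in the order of ITEMS; 0 encodes "not rated" (assumption).
STARS = {
    "Abby":  [4, 5, 2, 4],
    "Dusko": [2, 2, 4, 5],
    "Stef":  [2, 5, 3, 0],
    "Temra": [0, 3, 3, 4],
    "Luka":  [5, 0, 1, 2],
}


def scaled_context(threshold):
    """Binary relation: user u is related to item j iff the rating is >= threshold."""
    return {(u, j) for u in USERS for j, s in zip(ITEMS, STARS[u]) if s >= threshold}


def concepts(threshold):
    """All formal concepts (extent, intent) of the scaled context, by closing every item subset."""
    rel = scaled_context(threshold)
    found = set()
    for r in range(len(ITEMS) + 1):
        for chosen in combinations(ITEMS, r):
            # Extent: users related to every chosen item.
            extent = frozenset(u for u in USERS if all((u, j) in rel for j in chosen))
            # Intent: items shared by every user in the extent (closure of `chosen`).
            intent = frozenset(j for j in ITEMS if all((u, j) in rel for u in extent))
            found.add((extent, intent))
    return found


for t in (3, 4, 5):
    print(t, len(concepts(t)))
```

Enumerating all subsets is only feasible for tiny contexts such as this one; it is used here because it keeps the closure step explicit. Comparing the concept sets obtained for different thresholds is one way to explore the conjecture stated above.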

This view of the quantitative and the qualitative concept analyses as parts of a putative general FCA toolkit raises 
an interesting question of their relation with LSA and the spectral methods of concept analysis IfTOl [Tl. which seem 
different. Some preliminary discussions on this question can be found in ll3n[32l . While FCA captures a particle view 
of network traffic, where the shortest path determines the proximity of two network nodes, LSA corresponds to the 
wave view of the traffic, where the proximity increases with the number of paths. Different application domains seem 
to justify different views, and call for a broad view of all concept mining methods as parts of the same general toolkit. 

Acknowledgements. Anonymous reviewers' suggestions helped me to improve the paper, and to overcome some
of my initial ignorance about the FCA literature. I am particularly grateful to Dmitry Ignatov, who steered the
reviewing process with remarkable patience and tact. I hope that my work will justify the enlightened support that I
encountered in these first contacts with the FCA community.